AITopics | performance estimator

Collaborating Authors

performance estimator

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Robust Batch-Level Query Routing for Large Language Models under Cost and Capacity Constraints

Markovic-Voronov, Jelena, Behdin, Kayhan, Xu, Yuanda, Zhou, Zhengze, Wang, Zhipeng, Mazumder, Rahul

arXiv.org Machine Learning, Mar-31-2026

We study the problem of routing queries to large language models (LLMs) under cost, GPU resources, and concurrency constraints. Prior per-query routing methods often fail to control batch-level cost, especially under non-uniform or adversarial batching. To address this, we propose a batch-level, resource-aware routing framework that jointly optimizes model assignment for each batch while respecting cost and model capacity limits. We further introduce a robust variant that accounts for uncertainty in predicted LLM performance, along with an offline instance allocation procedure that balances quality and throughput across multiple models. Experiments on two multi-task LLM benchmarks show that robustness improves accuracy by 1-14% over non-robust counterparts (depending on the performance estimator), batch-level routing outperforms per-query methods by up to 24% under adversarial batching, and optimized instance allocation yields additional gains of up to 3% compared to a non-optimized allocation, all while strictly controlling cost and GPU resource constraints.
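As a rough illustration of what assigning models jointly for a whole batch under a cost budget and per-model capacity limits can look like, here is a minimal brute-force sketch. It is not the authors' method: the abstract does not describe their optimization procedure, and the function name `route_batch`, the input layout, and the toy numbers are all assumptions made for illustration. Exhaustive search is only practical for very small batches.

```python
from itertools import product


def route_batch(quality, cost, budget, capacity):
    """Pick one model per query to maximise summed predicted quality.

    quality[q][m]: predicted quality of model m on query q
                   (hypothetical output of some performance estimator)
    cost[q][m]:    cost of answering query q with model m
    budget:        maximum total cost allowed for the batch
    capacity[m]:   maximum number of queries model m may take in the batch
    """
    n_queries, n_models = len(quality), len(capacity)
    best_value, best_assignment = float("-inf"), None
    # Enumerate every assignment of models to queries (n_models ** n_queries).
    for assignment in product(range(n_models), repeat=n_queries):
        total_cost = sum(cost[q][m] for q, m in enumerate(assignment))
        if total_cost > budget:
            continue  # violates the batch-level cost constraint
        if any(assignment.count(m) > capacity[m] for m in range(n_models)):
            continue  # violates a per-model capacity constraint
        value = sum(quality[q][m] for q, m in enumerate(assignment))
        if value > best_value:
            best_value, best_assignment = value, assignment
    return best_assignment, best_value


# Toy batch: model 0 is strong but expensive, model 1 is cheap.
quality = [[0.9, 0.6], [0.8, 0.7], [0.5, 0.45]]
cost = [[4, 1], [4, 1], [4, 1]]
assignment, value = route_batch(quality, cost, budget=6, capacity=[2, 3])
print(assignment, round(value, 2))
```

With a budget of 6, only one query can go to the expensive model, and the search sends the query with the largest quality gap there. A per-query rule that looks at each query in isolation has no way to see that shared budget, which is the failure mode the abstract attributes to per-query routing.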

dataset1, large language model, machine learning, (20 more...)

arXiv.org Machine Learning

2603.26796

Genre: Research Report (0.82)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)

Evaluating Efficient Performance Estimators of Neural Architectures

Neural Information Processing Systems, Dec-24-2025, 05:36:43 GMT

Conducting efficient performance estimations of neural architectures is a major challenge in neural architecture search (NAS). To reduce the architecture training costs in NAS, one-shot estimators (OSEs) amortize these costs by sharing the parameters of one supernet between all architectures. Recently, zero-shot estimators (ZSEs) that involve no training have been proposed to further reduce the architecture evaluation cost. Despite the high efficiency of these estimators, the quality of such estimations has not been thoroughly studied. In this paper, we conduct an extensive and organized assessment of OSEs and ZSEs on five NAS benchmarks: NAS-Bench-101/201/301, and NDS ResNet/ResNeXt-A. Specifically, we employ a set of NAS-oriented criteria to study the behavior of OSEs and ZSEs, and reveal their biases and variances. After analyzing how and why the OSE estimations are unsatisfying, we explore how to mitigate the correlation gap of OSEs from three perspectives. Through our analysis, we offer suggestions for the future application and development of efficient architecture performance estimators. Furthermore, the analysis framework proposed in our work could be used in future research to give a more comprehensive understanding of newly designed architecture performance estimators.
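One common way to judge how well an estimator ranks architectures is a rank correlation between its scores and the accuracies obtained by full training. The sketch below computes Kendall's tau (the tau-a variant, which does not correct for ties) in plain Python. The abstract does not list the paper's criteria, so treating Kendall's tau as the measure is an assumption here, and the score and accuracy values are invented for illustration.

```python
from itertools import combinations


def kendall_tau(estimated, actual):
    """Kendall's tau-a between estimator scores and ground-truth accuracies.

    Counts architecture pairs ordered the same way by both lists (concordant)
    minus pairs ordered oppositely (discordant), divided by the number of pairs.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(estimated)), 2):
        agreement = (estimated[i] - estimated[j]) * (actual[i] - actual[j])
        if agreement > 0:
            concordant += 1
        elif agreement < 0:
            discordant += 1
        # Pairs tied in either list count as neither (tau-a convention).
    n_pairs = len(estimated) * (len(estimated) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical values for four architectures.
actual_accuracy = [0.91, 0.93, 0.89, 0.94]   # after full training
estimator_score = [0.55, 0.60, 0.58, 0.70]   # e.g. from a supernet or a zero-shot proxy
tau = kendall_tau(estimator_score, actual_accuracy)
print(round(tau, 3))
```

A value of 1 means the estimator orders every pair of architectures as the ground truth does, and a value near 0 means its ranking carries little information. Here five of the six pairs agree and one disagrees, giving (5 - 1) / 6.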

efficient performance estimator, name change, performance estimator, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.66)

Interaction Concordance Index: Performance Evaluation for Interaction Prediction Methods

Pahikkala, Tapio, Numminen, Riikka, Movahedi, Parisa, Karmitsa, Napsu, Airola, Antti

arXiv.org Machine Learning, Oct-17-2025

Consider two sets of entities and their members' mutual affinity values, say drug-target affinities (DTAs). Drugs and targets are said to interact in their effects on DTAs if a drug's effect on the DTA depends on the target. The presence of interaction implies that assigning one drug to one target and another drug to another target does not provide the same aggregate DTA as the reversed assignment would. Accordingly, correctly capturing interactions enables better decision-making, for example, in the allocation of limited numbers of drug doses to their best matching targets. DTA predictors are commonly learned either solely from known DTAs or together with side information on the entities, such as chemical structures of drugs and targets. In this paper, we introduce a performance estimator for the prediction of interaction directions, which we call the interaction concordance index (IC-index), for both fixed predictors and the machine learning algorithms that infer them. The IC-index complements the commonly used DTA prediction performance estimators by evaluating the ratio of correctly predicted directions of interaction effects in data. First, we show the invariance of the IC-index on predictors unable to capture interactions. Second, we show that a learning algorithm's permutation equivariance regarding drug and target identities implies its inability to capture interactions when either the drug, the target, or both are unseen during training. In practical applications, this equivariance is remedied by incorporating appropriate side information on drugs and targets. We make a comprehensive empirical evaluation over several biomedical interaction data sets with various state-of-the-art machine learning algorithms. The experiments demonstrate how different types of affinity strength prediction methods perform in terms of the IC-index, which complements existing prediction performance estimators.
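The idea of a "direction of interaction" can be made concrete with a small sketch. For two drugs and two targets, compare the aggregate affinity of one assignment with that of the reversed assignment; the sign of the difference is the direction, and the score is the share of such quadruples whose sign the predictor gets right. This follows the abstract's description only. The paper's exact definition may differ, and the function names, the assumption of a fully observed affinity matrix, and the choice to count a predicted tie as 0.5 (by analogy with the usual concordance index convention) are assumptions made here.

```python
from itertools import combinations


def interaction_effect(matrix, d1, d2, t1, t2):
    """Aggregate affinity of (d1->t1, d2->t2) minus that of the reversed assignment."""
    return matrix[d1][t1] + matrix[d2][t2] - matrix[d1][t2] - matrix[d2][t1]


def ic_index_sketch(true, pred):
    """Share of drug-pair/target-pair quadruples with correctly predicted direction.

    true, pred: affinity matrices indexed as matrix[drug][target].
    Quadruples with no true interaction are skipped; a predicted tie scores 0.5
    (both choices are assumptions of this sketch).
    """
    score, counted = 0.0, 0
    for d1, d2 in combinations(range(len(true)), 2):
        for t1, t2 in combinations(range(len(true[0])), 2):
            t = interaction_effect(true, d1, d2, t1, t2)
            if t == 0:
                continue
            p = interaction_effect(pred, d1, d2, t1, t2)
            counted += 1
            if p == 0:
                score += 0.5
            elif (p > 0) == (t > 0):
                score += 1.0
    return score / counted if counted else float("nan")


true_dta = [[5, 1], [2, 6]]
# Additive predictor: drug effect + target effect, so it cannot express interaction.
additive = [[1 + 10, 1 + 20], [2 + 10, 2 + 20]]
interacting = [[4, 2], [1, 5]]
print(ic_index_sketch(true_dta, additive), ic_index_sketch(true_dta, interacting))
```

Under these assumptions, any purely additive predictor has zero predicted interaction for every quadruple and therefore scores 0.5 regardless of its drug and target effects, which is in the spirit of the invariance property the abstract mentions.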

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2510.14419

Country:

Europe > Finland > Southwest Finland > Turku (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Evaluating Efficient Performance Estimators of Neural Architectures

Neural Information Processing Systems, Aug-22-2025, 00:21:19 GMT

Furthermore, the analysis framework proposed in our work could be utilized in future research to give a more comprehensive understanding of newly designed architecture performance estimators.

architecture, architecture search, search space, (14 more...)

Neural Information Processing Systems

Country: Asia > China > Beijing > Beijing (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.86)

Are Recommenders Self-Aware? Label-Free Recommendation Performance Estimation via Model Uncertainty

Li, Jiayu, Ye, Ziyi, Jian, Guohao, Guo, Zhiqiang, Ma, Weizhi, Ai, Qingyao, Zhang, Min

arXiv.org Artificial Intelligence, Aug-1-2025

Can a recommendation model be self-aware? This paper investigates the recommender's self-awareness by quantifying its uncertainty, which provides a label-free estimation of its performance. Such self-assessment can enable more informed understanding and decision-making before the recommender engages with any users. To this end, we propose an intuitive and effective method, probability-based List Distribution uncertainty (LiDu). LiDu measures uncertainty by determining the probability that a recommender will generate a certain ranking list based on the prediction distributions of individual items. We validate LiDu's ability to represent model self-awareness in two settings: (1) with a matrix factorization model on a synthetic dataset, and (2) with popular recommendation algorithms on real-world datasets. Experimental results show that LiDu is more correlated with recommendation performance than a series of label-free performance estimators. Additionally, LiDu provides valuable insights into the dynamic inner states of models throughout training and inference. This work establishes an empirical connection between recommendation uncertainty and performance, framing it as a step towards more transparent and self-evaluating recommender systems.

artificial intelligence, machine learning, recommender, (18 more...)

arXiv.org Artificial Intelligence

2507.23208

Country: Asia > China (0.15)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

MERGE$^3$: Efficient Evolutionary Merging on Consumer-grade GPUs

Mencattini, Tommaso, Minut, Adrian Robert, Crisostomi, Donato, Santilli, Andrea, Rodolà, Emanuele

arXiv.org Artificial Intelligence, Feb-9-2025

Evolutionary model merging enables the creation of high-performing multi-task models but remains computationally prohibitive for consumer hardware. We introduce MERGE$^3$, an efficient framework that makes evolutionary merging feasible on a single GPU by reducing fitness computation costs 50$\times$ while preserving performance. MERGE$^3$ achieves this by Extracting a reduced dataset for evaluation, Estimating model abilities using Item Response Theory (IRT), and Evolving optimal merges via IRT-based performance estimators. Our method enables state-of-the-art multilingual and cross-lingual merging, transferring knowledge across languages with significantly lower computational overhead. We provide theoretical guarantees and an open-source library, democratizing high-quality model merging.

evolutionary algorithm, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2502.10436

Country:

Asia > Middle East > Jordan (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report > New Finding (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.67)

Black-box Optimization with Simultaneous Statistical Inference for Optimal Performance

Lian, Teng, Hu, Jian-Qiang, Wu, Yuhang, Zheng, Zeyu

arXiv.org Machine Learning, Jan-13-2025

Black-box optimization is often encountered in decision-making for complex systems management, where knowledge of the system is limited. Under these circumstances, it is essential to balance the utilization of new information with computational efficiency. In practice, decision-makers often face the dual tasks of optimization and statistical inference for the optimal performance, in order to achieve it with high reliability. Our goal is to address the dual tasks in an online fashion. Wu et al. (2022) [arXiv preprint: 2210.06737] point out that the sample average of performance estimates generated by the optimization algorithm need not admit a central limit theorem. We propose an algorithm that not only tackles this issue, but also provides an online consistent estimator for the variance of the performance. Furthermore, we characterize the convergence rate of the coverage probabilities of the asymptotic confidence intervals.

algorithm, estimator, statistical inference, (14 more...)

arXiv.org Machine Learning

2501.07795

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Industry: Transportation > Air (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

GNNavigator: Towards Adaptive Training of Graph Neural Networks via Automatic Guideline Exploration

Qiao, Tong, Yang, Jianlei, Qi, Yingjie, Zhou, Ao, Bai, Chen, Yu, Bei, Zhao, Weisheng, Hu, Chunming

arXiv.org Artificial Intelligence, Apr-15-2024

Graph Neural Networks (GNNs) have recently achieved significant success in many applications. However, balancing GNN training runtime cost, memory consumption, and attainable accuracy for various applications is non-trivial. Previous training methodologies suffer from inferior adaptability and lack a unified training optimization solution. To address the problem, this work proposes GNNavigator, an adaptive GNN training configuration optimization framework. GNNavigator meets diverse GNN application requirements due to our unified software-hardware co-abstraction, proposed GNN training performance model, and practical design space exploration solution. Experimental results show that GNNavigator can achieve up to 3.1x speedup and 44.9% peak memory reduction with comparable accuracy to state-of-the-art approaches.

gnnavigator, optimization, proceedings, (14 more...)

arXiv.org Artificial Intelligence

2404.09544

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > China (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

OmniBoost: Boosting Throughput of Heterogeneous Embedded Devices under Multi-DNN Workload

Karatzas, Andreas, Anagnostopoulos, Iraklis

arXiv.org Artificial Intelligence, Jul-6-2023

Modern Deep Neural Networks (DNNs) exhibit profound efficiency and accuracy properties. This has introduced application workloads that comprise multiple DNN applications, raising new challenges regarding workload distribution. Equipped with a diverse set of accelerators, newer embedded systems present architectural heterogeneity, which current run-time controllers are unable to fully utilize. To enable high throughput in multi-DNN workloads, such a controller must explore hundreds of thousands of possible solutions to exploit the underlying heterogeneity. In this paper, we propose OmniBoost, a lightweight and extensible multi-DNN manager for heterogeneous embedded devices. We leverage stochastic space exploration and combine it with a highly accurate performance estimator to observe a 4.6x average throughput boost compared to other state-of-the-art methods. The evaluation was performed on the HiKey970 development board.

artificial intelligence, machine learning, workload, (20 more...)

arXiv.org Artificial Intelligence

2307.0329

Country:

Africa > Mali (0.04)
North America > United States > Illinois (0.04)
Europe (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Architecture > Embedded Systems (0.92)
